The FastMap Algorithm for Shortest Path Computations

Authors

  • Liron Cohen
  • Tansel Uras
  • Shiva Jahangiri
  • Aliyah Arunasalam
  • Sven Koenig
  • T. K. Satish Kumar
Abstract

We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. In this space, the Euclidean distance between any two nodes approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed using A* search with the Euclidean distances as heuristic estimates. Our preprocessing algorithm, dubbed FastMap, is inspired by the Data Mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. Our FastMap algorithm also produces admissible and consistent heuristics and therefore guarantees the generation of optimal paths. Moreover, FastMap works on general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not always well defined. Empirically too, we demonstrate that the FastMap heuristic is competitive with other state-of-the-art heuristics like the Differential heuristic.

Introduction and Related Work

Shortest path problems commonly occur in the inner procedures of many AI programs. In video games, for example, a large fraction of CPU cycles is spent on shortest path computations [Uras and Koenig, 2015]. Many other tasks in AI, including motion planning [LaValle, 2006], temporal reasoning [Dechter, 2003], and decision making [Russell and Norvig, 2009], also involve finding and reasoning about shortest paths. While Dijkstra’s algorithm [Dijkstra, 1959] can be used to compute shortest paths in polynomial time, faster computations have important implications for the time-efficiency of solving the aforementioned tasks. One way to boost shortest path computations is to use the A* search framework with an informed heuristic [Hart et al., 1968].
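To make the runtime phase concrete, the following is a minimal sketch of A* search that uses the Euclidean distance between node coordinates as its heuristic. It is illustrative only: the names `astar`, `graph`, and `coords` are hypothetical, and the coordinates are set by hand so that the heuristic is consistent, rather than being produced by FastMap.

```python
import heapq
import math


def astar(graph, coords, start, goal):
    """A* on an undirected weighted graph with a Euclidean heuristic.

    graph:  dict mapping node -> list of (neighbor, edge_weight)
    coords: dict mapping node -> coordinate tuple (assumed precomputed)
    Returns (path_length, path), or (math.inf, []) if goal is unreachable.
    Closing nodes on first expansion is safe only for a consistent heuristic.
    """
    def h(n):
        return math.dist(coords[n], coords[goal])

    g = {start: 0.0}          # best known cost from start
    parent = {start: None}    # back-pointers for path recovery
    frontier = [(h(start), start)]
    closed = set()
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in closed:
            continue
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        closed.add(node)
        for nbr, w in graph[node]:
            new_g = g[node] + w
            if new_g < g.get(nbr, math.inf):
                g[nbr] = new_g
                parent[nbr] = node
                heapq.heappush(frontier, (new_g + h(nbr), nbr))
    return math.inf, []


# Hypothetical 4-node graph; every edge weight is at least the Euclidean
# distance between its endpoints, so the heuristic is consistent here.
graph = {
    "a": [("b", 1.0), ("d", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 3.0)],
    "d": [("a", 1.0), ("c", 3.0)],
}
coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.0)}

length, path = astar(graph, coords, "a", "c")
print(length, path)  # 2.0 ['a', 'b', 'c']
```

Because each heuristic evaluation is a closed-form distance between two coordinate tuples, its cost depends only on the number of dimensions and not on the size of the graph.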
A perfect heuristic is one that returns the true shortest path distance between any two nodes in a given graph. In this graph, A* with such a heuristic and proper tie-breaking is guaranteed to expand nodes only on an optimal path between the specified start and goal nodes. In general, computing the perfect heuristic value between two nodes is as hard as computing the shortest path between them. Hence, A* search can benefit from a perfect heuristic only if it is computed offline. However, precomputing all pairwise shortest path distances is not only time-intensive but also requires a prohibitive O(N^2) memory, where N is the number of nodes.

Many methods for preprocessing a given graph (without precomputing all pairwise shortest path distances) have been studied before and can be grouped into several categories. Hierarchical abstractions that yield suboptimal paths have been used to reduce the size of the search space by abstracting groups of vertices [Botea et al., 2004; Sturtevant and Buro, 2005]. More informed heuristics [Björnsson and Halldórsson, 2006; Cazenave, 2006; Sturtevant et al., 2009] guide the searches better to expand fewer states. Hierarchies can also be used to derive heuristics during search [Leighton et al., 2008; Holte et al., 1994]. Dead-end detection and other pruning methods [Björnsson and Halldórsson, 2006; Goldenberg et al., 2010; Pochter et al., 2010] identify areas of the graph that do not need to be searched to find shortest paths. Search with contraction hierarchies [Geisberger et al., 2008] is an optimal and extremely hierarchical method, as every level of the hierarchy contains only a single node. It has been shown to be effective on road networks but seems to be less effective on graphs with higher branching factors, such as grid-based game maps [Storandt, 2013]. Another approach is that of N-level graphs [Uras and Koenig, 2014] constructed from undirected graphs by partitioning the nodes into levels.
The hierarchy allows significant pruning during search.

A different approach that does not rely on preprocessing the graph makes use of some notion of a geometric distance between two nodes as a heuristic estimate of the shortest path distance between them. One such common heuristic that is used in gridworlds is the Manhattan Distance heuristic.¹ For many gridworlds, A* search with the Manhattan Distance heuristic outperforms Dijkstra’s algorithm. However, in complicated 2D/3D gridworlds like mazes, the Manhattan Distance heuristic may not be informed enough to efficiently guide A* search. Another issue associated with Manhattan Distance-like heuristics is that they are not well defined for general graphs.² For a graph that cannot be conceived in a geometric space, there is no closed-form formula for a “geometric” heuristic estimate for the distance between two nodes because there are no coordinates associated with them.

¹ In a 4-connected 2D gridworld, for example, the Manhattan Distance between two cells (x_1, y_1) and (x_2, y_2) is |x_1 − x_2| + |y_1 − y_2|. Similar generalizations exist for 3D and 8-connected gridworlds.

arXiv:1706.02792v2 [cs.AI] 21 Oct 2017

For a graph that does not already have a geometric embedding in Euclidean space, a preprocessing algorithm can be used to generate one. As described before, at runtime, A* search would then use the Euclidean distance between any two nodes in this space as an estimate for the length of the shortest path between them in the given graph. One such approach is presented in [Rayner et al., 2011]. This approach guarantees admissibility and consistency of the heuristic and therefore generates optimal paths. However, it requires solving a Semidefinite Program (SDP) in its preprocessing phase. SDPs can be solved in polynomial time [Vandenberghe and Boyd, 1996]; and in this case, additional structure is leveraged to solve them in cubic time [Rayner et al., 2011].
Still, a cubic preprocessing time limits the size of the graphs that are amenable to this approach.

The Differential heuristic is another state-of-the-art approach that has the benefits of near-linear preprocessing time. However, unlike the approach in [Rayner et al., 2011], it does not produce an explicit Euclidean embedding. In the preprocessing phase of the Differential heuristic approach, some nodes of the graph are chosen as pivot nodes. The shortest path distances between each pivot node and every other node are precomputed and stored [Sturtevant et al., 2009]. At runtime, the heuristic distance between two nodes, a and b, is given by max_p |d(a, p) − d(p, b)|, where p ranges over the pivot nodes and d(·, ·) is the precomputed distance. The preprocessing time is linear in the number of pivots times the size of the graph. The required space is linear in the number of pivots times the number of nodes, although a more succinct representation is presented in [Goldenberg et al., 2011]. Similar preprocessing techniques are used in Portal-Based True Distance heuristics [Goldenberg et al., 2010].

In this paper, we present a new preprocessing algorithm that produces an explicit Euclidean embedding while running in near-linear time. It therefore has the benefits of the Differential heuristic’s preprocessing time as well as that of producing an embedding from which heuristic estimates can be quickly computed using closed-form formulas. Our preprocessing algorithm, dubbed FastMap, is inspired by the Data Mining algorithm of the same name [Faloutsos and Lin, 1995]. It is orders of magnitude faster than SDP-based approaches for producing Euclidean embeddings. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of optimal paths. In comparison to other heuristics derived from closed-form formulas, like the Manhattan or the Octile Distance heuristics, the FastMap heuristic has several advantages.
First, it is defined for general undirected graphs (even if they are not gridworlds). Second, we observe empirically that even in gridworlds, A* with the FastMap heuristic outperforms A* with the Manhattan or the Octile Distance heuristic. In comparison to the Differential heuristic with the same memory resources, the FastMap heuristic is not only competitive with it on some graphs but even outperforms it on some others. This performance of FastMap is encouraging given that it produces an explicit Euclidean embedding that has other representational benefits like recovering the underlying manifolds of the graph and/or visualizing it. Moreover, we observe that the FastMap and the Differential heuristics have complementary strengths and can be easily combined to generate a more informed heuristic.

² Henceforth, whenever we refer to a graph, we mean an edge-weighted undirected graph unless stated otherwise.

The Origin of FastMap

The FastMap algorithm [Faloutsos and Lin, 1995] was introduced in the Data Mining community for automatically generating geometric embeddings of abstract objects. For example, if we are given objects in the form of long DNA strings, multimedia datasets such as voice excerpts or images, or medical datasets such as ECGs or MRIs, there is no geometric space in which these objects can be naturally visualized. However, in many of these domains, there is still a well defined distance function between every pair of objects. For example, given two DNA strings, the edit distance between them³ is well defined although an individual DNA string cannot be conceptualized in a geometric space.

Clustering techniques, such as the k-means algorithm, are well studied in Machine Learning [Alpaydin, 2010]; but they cannot be applied directly to domains with abstract objects as described above. This is because these algorithms assume that the objects are described as points in a geometric space.
FastMap revives the applicability of these clustering techniques by first creating an artificial Euclidean embedding for the abstract objects. The Euclidean embedding is such that the pairwise distances are approximately preserved. Such an embedding would also help in the visualization of the abstract objects. This visualization, for example, can aid physicians in identifying correlations between symptoms or other patterns from medical records.

We are given a complete undirected edge-weighted graph G = (V, E). Each vertex v_i ∈ V represents an abstract object O_i. Between any two vertices, v_i and v_j, there is an edge (v_i, v_j) ∈ E with weight D(O_i, O_j). Here, D(O_i, O_j) is the given pairwise distance between the objects O_i and O_j. A Euclidean embedding assigns to each object O_i a K-dimensional point p_i ∈ R^K. A good Euclidean embedding is one in which the Euclidean distance between any two points, p_i and p_j, closely approximates D(O_i, O_j).

One of the early approaches for generating such an embedding was based on the idea of multi-dimensional scaling (MDS) [Torgerson, 1952]. Here, overall distortion of the pairwise distances is measured in terms of the “energy” stored in “springs” connecting each pair of objects. MDS, however, requires O(N^2) time (N = |V|) and hence does not scale well in practice. On the other hand, FastMap [Faloutsos and Lin, 1995] requires only linear time. Both methods embed the objects in a K-dimensional space for a user-specified K.

³ The edit distance between two strings is the minimum number of insertions, deletions or substitutions that are needed to transform one to the other.
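The criterion for a good Euclidean embedding stated above can be checked numerically. The sketch below compares the Euclidean distances between embedded points with the given pairwise distances D and reports the largest and the mean absolute error. The function name `embedding_distortion` and the example data are hypothetical, and absolute error is only one possible distortion measure; it is not the spring-energy measure used by MDS.

```python
import math
from itertools import combinations


def embedding_distortion(points, D):
    """Return (max_abs_error, mean_abs_error) over all object pairs.

    points: list of K-dimensional coordinate tuples, one per object
    D:      symmetric matrix (list of lists) of given pairwise distances
    """
    errors = [
        abs(math.dist(points[i], points[j]) - D[i][j])
        for i, j in combinations(range(len(points)), 2)
    ]
    return max(errors), sum(errors) / len(errors)


# Hypothetical distances between three abstract objects (a 3-4-5 triangle).
D = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]

# With K = 2 these distances can be reproduced exactly.
exact_2d = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
max_err_2d, mean_err_2d = embedding_distortion(exact_2d, D)

# With K = 1 one of the three distances is necessarily distorted.
lossy_1d = [(0.0,), (3.0,), (4.0,)]
max_err_1d, mean_err_1d = embedding_distortion(lossy_1d, D)
```

In this example the one-dimensional embedding preserves two of the three distances and misses the third by 4, which illustrates why the user-specified K trades memory for accuracy.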


Similar resources

A New Algorithm for the Discrete Shortest Path Problem in a Network Based on Ideal Fuzzy Sets

A shortest path problem is a practical issue in networks for real-world situations. This paper addresses the fuzzy shortest path (FSP) problem to obtain the best fuzzy path among fuzzy paths sets. For this purpose, a new efficient algorithm is introduced based on a new definition of ideal fuzzy sets (IFSs) in order to determine the fuzzy shortest path. Moreover, this algorithm is developed for ...


Fast Shortest Path Computation for Solving the Multicommodity Flow Problem

For solving the multicommodity flow problems, Lagrangian relaxation based algorithms are fast in practice. The time-consuming part of the algorithms is the shortest path computations in solving the Lagrangian dual problem. We show that an A* search based algorithm is faster than Dijkstra’s algorithm for the shortest path computations when the number of demands is relatively smaller than the siz...


Two optimal algorithms for finding bi-directional shortest path design problem in a block layout

In this paper, Shortest Path Design Problem (SPDP) in which the path is incident to all cells is considered. The bi-directional path is one of the known types of configuration of networks for Automated Guided Vehicles (AGV). To solve this problem, two algorithms are developed. For each algorithm an Integer Linear Programming (ILP) is determined. The objective functions of both algorithms are t...


Acceleration of Shortest Path and Constrained Shortest Path Computation

We study acceleration methods for point-to-point shortest path and constrained shortest path computations in directed graphs, in particular in road and railroad networks. Our acceleration methods are allowed to use a preprocessing of the network data to create auxiliary information which is then used to speed-up shortest path queries. We focus on two methods based on Dijkstra’s algorithm for sh...


Finding the nearest facility for travel and waiting time in a transport network

One common user query to a navigation service is to find the nearest facility in terms of time. The facility that the user asks about as a destination may have a queuing service system (e.g. a bank), which means that the cost function of the shortest path includes the waiting time at the destination as well as the travel time. This research is conducted in zone 1 of Mashhad with Bank ...



Journal:
  • CoRR

Volume: abs/1706.02792  Issue:

Pages: -

Publication date: 2018